2,661 research outputs found

    What entropy at the edge of chaos?

    Full text link
    Numerical experiments support the interesting conjecture that statistical methods be applicable not only to fully-chaotic systems, but also at the edge of chaos by using Tsallis' generalizations of the standard exponential and entropy. In particular, the entropy increases linearly and the sensitivity to initial conditions grows as a generalized exponential. We show that this conjecture has actually a broader validity by using a large class of deformed entropies and exponentials and the logistic map as test cases.Comment: 6 pages, 6 figures, uses ws-procs975x65.cls. Proceedings of the Workshop "Complexity, Metastability and Nonextensivity" held in Erice 20-26 July 2004. New version: minor changes (one reference added, typos corrected, positions of figures changed, some comments added

    A Petri Nets Model for Blockchain Analysis

    Full text link
    A Blockchain is a global shared infrastructure where cryptocurrency transactions among addresses are recorded, validated and made publicly available in a peer- to-peer network. To date the best known and important cryptocurrency is the bitcoin. In this paper we focus on this cryptocurrency and in particular on the modeling of the Bitcoin Blockchain by using the Petri Nets formalism. The proposed model allows us to quickly collect information about identities owning Bitcoin addresses and to recover measures and statistics on the Bitcoin network. By exploiting algebraic formalism, we reconstructed an Entities network associated to Blockchain transactions gathering together Bitcoin addresses into the single entity holding permits to manage Bitcoins held by those addresses. The model allows also to identify a set of behaviours typical of Bitcoin owners, like that of using an address only once, and to reconstruct chains for this behaviour together with the rate of firing. Our model is highly flexible and can easily be adapted to include different features of the Bitcoin crypto-currency system

    ABCDE -- Agile Block Chain Dapp Engineering

    Get PDF
    Cryptocurrencies and their foundation technology, the Blockchain, are reshaping finance and economics, allowing a decentralized approach enabling trusted applications with no trusted counterpart. More recently, the Blockchain and the programs running on it, called Smart Contracts, are also finding more and more applications in all fields requiring trust and sound certifications. Some people have come to the point of saying that the "Blockchain revolution" can be compared to that of the Internet and the Web in their early days. As a result, all software development revolving around the Blockchain technology is growing at a staggering rate. The feeling of many software engineers about such huge interest in Blockchain technologies is that of unruled and hurried software development, a sort of competition on a first-come-first-served basis which does not assure neither software quality, nor that the basic concepts of software engineering are taken into account. This paper tries to cope with this issue, proposing a software development process to gather the requirement, analyze, design, develop, test and deploy Blockchain applications. The process is based on several Agile practices, such as User Stories and iterative and incremental development based on them. However, it makes also use of more formal notations, such as some UML diagrams describing the design of the system, with additions to represent specific concepts found in Blockchain development. The method is described in good detail, and an example is given to show how it works.Comment: 26 pages, 7 figures, 8 table

    Smart contracts categorization with topic modeling techniques

    Get PDF
    One of the main advantages of the Ethereum blockchain is the possibility of developing smart contracts in a Turing complete environment. These general-purpose programs provide a higher level of security than traditional contracts and reduce other transaction costs associated with the bargaining practice. Developers use smart contracts to build their tokens and set up gambling games, crowdsales, ICO, and many others. Since the number of smart contracts inside the Ethereum blockchain is several million, it is unthinkable to check every program manually to understand its functionality. At the same time, it would be of primary importance to group sets of Smart Contracts according to their purposes and functionalities. One way to group Ethereum’s smart contracts is to use topic modeling techniques, taking advantage of the fact that many programs representing a specific topic are similar in the program structure. Starting from a dataset of 130k smart contracts, we built a Latent Dirichlet Allocation (LDA) model to spot the number of topics within our sample. Computing the coherence values for a different number of topics, we found out that the optimal number was 15. As we expected, most programs are tokens, games, crowdfunding platforms, and ICO

    Power laws in software systems

    Get PDF
    The main topic of my PhD has been the study of power laws in software systems within the perspective of describing software quality. My PhD research contributes to a recent stream of studies in software engineering, where the investigation of power laws in software systems has become widely popular in recent years, since they appear on an incredible variety of different software quantities and properties, like, for example, software metrics, software faults, refactoring, Java byte-code, module dependencies, software fractal dimension, lines of code, software packages and so on. The common presence of power laws suggests that software systems belong to the much larger category of complex systems, where typically self organization, fractality and emerging phenomena occur. Often my work involved the determination of a complex graph associated to the software system, defining the so called “complex software network”. For such complex software networks I analyzed different network metrics and I studied their relationships with software quality. In this PhD I took advantage of the theory of complex systems in order to study, to explain and sometimes to forecast properties and behavior of software systems. Thus my work involved the empirical study of many different statistical properties of software, in particular metrics, faults and refactorings, the construction and the application of statistical models for explaining such statistical properties, the implementation and the optimization of algorithms able to model their behavior, the introduction of metrics borrowed from Social Network Analysis (SNA) for describing relationships and dependencies among software modules. More specifically, my research activity regarded the followings topics: Bugs, power laws and software quality In [1] [7] [16] [20] [21] [22] module faultness and its implications on software quality are investigated. I studied data mining from CVS repositories of two large OO projects, Eclipse and Netbeans, focusing on “fixing- issue” commits, and compared static traditional approaches, like Knowledge Engineering, to dynamic approaches based on Machine Learning techniques. The work compares for the first time performances of Machine Learning (ML) techniques to automatic classify “fixing-issues” among message commits. Our study calculates precision and recall of different Machine Learning Classifiers for the correct classification of issue- reporting commits. The results show that some ML classifiers can correctly classify up to 99.9% of such commits. In [22] Java software systems are treated as complex graphs, where nodes represent a Java file - called compilation unit (CU) - and an edges represent a relations between them. The distribution of the number of bugs per CU, exhibits a power-law behavior in the tail, as well as the number of CUs influenced by a specific bug. The exam of the evolution of software metrics across different releases allows to understand how relationships among CUs metrics and CUs faultness change with time. In [1] module faultness is further discussed from a statistical perspective, using as case studies five versions of Eclipse, to show how log-normal, Double Pareto and Yule-Simon statistical distributions may fit the empirical bug distribution at least as well as the Weibull distribution proposed by Zhang. In particular, I discuss how some of these alternative distributions provide both a superior fit to empirical data and a theoretical motivation to be used for modeling the bug generation process. Further studies reported in [3] present a model based on the Yule process, able to explain the evolution of some properties of large object- oriented software systems. Four system properties related to code production of four large object-oriented software systems – Eclipse, Netbeans, JDK and Ant are analyzed. The properties analyzed, namely the naming of variables and methods, the call to methods and the inheritance hierarchies, show a power-law distribution. A software simulation allows to verify the goodness of the model, finding a very good correspondence between empirical data of subsequent software versions, and the prediction of the model presented. In [18], [19] and [23] three algorithms for an efficient implementation of the preferential attachment mechanism lying at the core of the Yule process are developed, and their efficiency in generating power- law distribution for different properties of Object Oriented (OO) software systems is discussed. Software metrics and SNA metrics In [2] [8] [13] [17] software metrics related to quality are analyzed and some metrics borrowed from the Social Network Analysis are applied to OO software graphs. In OO systems the modules are the classes, interconnected with each other by relationships like inheritance and dependency. It is possible to represent OO systems as software networks, where the classes are the network nodes and the relationships among classes are the network edges. Social Networks metrics, as for instance, the EGO metrics, allow to identify the role of each single node in the information flow through the network, being related to software modules and their dependencies. In [2] these metrics are compared with other traditional software metrics, like the Chidamber-Kemerer suite, and software graph metrics. The exam of the empirical distributions of all the metrics across the software modules of several releases of two large Java systems systematically shows fat-tails for all the metrics. Moreover, the various metric distributions look very similar and consistent across all system releases and are also very similar in both systems. Analytical distribution functions suitable for describing and studying the observed distributions are also provided. The work in [17] presents an extensive analysis of software metrics for 111 object-oriented systems written in Java. For each system, we considered 18 traditional metrics such as LOC and Chidamber and Kemerer metrics, as well as metrics derived from complex network theory and social network analysis, computed at class level. Most metrics follow a leptokurtotic distribution. Only a couple of metrics have patent normal behavior while some others are very irregular, and even bimodal. The statistics gathered allow to study and discuss the variability of metrics along different systems. In [8] a preliminary and exploratory analysis of the Eclipse subprojects is presented, using a joint application of SNA and traditional software metrics. The entire set of metrics has been summarized performing a Principal Component Analysis (PCA) and obtaining a very reduced number of independent principal components, which allow to represent the classes into a space where they show typical patterns. The preliminary results show how the joint application of traditional and network software metrics may be used to identify subprojects developed with similar functionalities and scopes. In [13] the software graphs of 96 systems of the Java Qualitas Corpus are anlyzed, parsing the source code and identifying the dependencies among classes. Twelve software metrics were analyzed, nine borrowed from Social Net- work Analysis (SNA), and three more traditional software metrics, such as Loc, Fan-in and Fan-out. The results show how the metrics can be partitioned in groups for almost the whole Java Qualitas Corpus, and that such grouping can provide insights on the topology of software networks. For two systems, Eclipse and Netbeans, we computed also the number of bugs, identifying the bugs affecting each class, and finding that some SNA metrics are highly correlated with bugs, while others are strongly anti-correlated. Software fractal dimension In [6] [12] [14] [15] the self similar structure of software networks is used to introduce the fractal dimension as a global software metric associated to software quality, at the system level and at the subproject level. In [6] the source code of various releases of two large OO Open Source (OS) Java software systems, Eclipse and Netbeans is analyzed, investigating the complexity of the whole release and of its subprojects. In all examined cases there exists a scaling region where it is possible to compute a self-similar coefficient, the fractal dimension, using “the box counting method”. Results show that this measure looks fairly related to software quality, acting as a global quality software metric. In particular, we computed the defects of each software system and we found a clear correlation among the number of defects in the system, or in a subproject, and its fractal dimension. This correlation exists across all the subprojects and also along the time evolution of the software systems, as new releases are delivered. In [14] software systems are considered as complex networks which have a self- similar structure under a length-scale transformation. On such complex software networks a self-similar coefficient is computed, also known as fractal dimension, using "the box counting method”. Several releases of the publically available Eclipse software system were analyzed, calculating the fractal dimension for twenty sub-projects, randomly chosen, for every release, as well as for each release as a whole. Our results display an overall consistency among the sub- projects and among all the analyzed releases. The study founds a very good correlation between the fractal dimension and the number of bugs for Eclipse and for twenty sub-projects. This result suggests that the fractal dimension could be considered as a global quality metric for large software systems. Works [12] and [15] propose an algorithm for computing the fractal dimension of a software network, and compare its performances with two other algorithms. Object of study are various large, object-oriented software systems. We built the associated graph for each system, also known as software network, analyzing the binary relationships (dependencies), among classes. We found that the structure of such software networks is self-similar under a length-scale transformation. The fractal dimension of these networks is computed using a Merge algorithm, first devised by the authors, a Greedy Coloring algorithm, based on the equivalence with the graph coloring problem, and a Simulated Annealing algorithm, largely used for efficiently determining minima in multi-dimensional problems. Our study examines both efficiency and accuracy, showing that the Merge algorithm is the most efficient, while the Simulated Annealing is the most accurate. The Greeding Coloring algorithm lays in between the two, having speed very close to the Merge algorithm, and accuracy comparable to the Simulated Annealing algorithm. 1.b Further research activity In [4] [9] [10] [11] the problem of software refactoring is analyzed. The study reported in [4] analyzes the effect of particular refactorings on class coupling for different releases of four Object Oriented (OO) Open Source (OS) Java software systems: Azureus, Jtopen, Jedit and Tomcat, as representative of general Java OS systems. Specifically, the “add parameter” to a method and “remove parameter” from a method refactorings, as defined according to Fowler’s dictionary, may influence class coupling changing fan-in and fan-out of classes they are applied to. The work investigates, both qualitatively and quantitatively, what is the global effect of the application of such refactorings, providing best fitting statistical distributions able to describe the changes in fan-in and fan-out couplings. A detailed analysis of the best fitting parameters and of their changes when refactoring occurs, has been performed, estimating the effect of refactoring on coupling before it is applied. Such estimates may help in determining refactoring costs and benefits . In [9] a study of the effect of fan-in and fan-out metrics is performed from the perspective of two refactorings, “add parameter to” and “remove parameter from” a method, collecting these two refactorings from multiple releases of the Tomcat open source system. Results show significant differences in the profiles of statistical distributions of fan-in and fan-out between refactored and not refactored classes. A strong over-arching theme emerged: developers seemed to focus on the refactoring of classes with relatively high fan-in and fan-out values rather than classes with high values in any one. In [10] is considered for the first time how a single refactoring modified these metric values, what happened when refactorings had been applied to a single class in unison and finally, what influence a set of refactorings had on the shape of FanIn and FanOut distributions. Results indicated that, on average, refactored classes tended to have larger FanIn and FanOut values when compared with non-refactored classes. Where evidence of multiple (different) refactorings applied to the same class was found, the net effect (in terms of FanIn and FanOut coupling values) was negligible. In [11] is shown how highly-coupled classes were more prone to refactoring, particularly through a set of ‘core’ refactorings. However, wide variations were found across systems for our chosen measures of coupling namely, fan-in and fan-out. Specific individual refactorings were also explored to gain an understanding of why these differences may have occurred. An exploration of open questions through the extraction of fifty-two of Fowler’s catalog of refactorings drawn from versions of four open-source systems is accomplished, comparing the coupling characteristics of each set of refactored classes with the corresponding set of non-refactored classes. In [7] I presented some preliminary studies also on the relationships about Micro- patterns, more specifically anti-patterns, and software quality, while in [5] and [21] I analyzed the role of Agile methodologies in software production and the relationships with software quality and the presence of bugs
    • 

    corecore